BMatch: A Quality/Performance Balanced Approach for Large Scale Schema Matching
Authors
Abstract
Schema matching is a crucial task for gathering information from the same domain. This is even more the case when dealing with data warehouses, where a large number of data sources are available and require matching and integration. However, the matching process is still largely performed manually or semi-automatically, which discourages the use of large-scale integration systems. Indeed, these large-scale scenarios require a solution that ensures both acceptable matching quality and good performance. In this article, we present an approach to efficiently match a large number of schemas. The quality aspect is based on the combination of terminological methods and a cosine measure between context vectors. The performance aspect relies on a B-tree indexing structure to reduce the search space. Finally, our approach has been implemented, and experiments with real sets of schemas show that it is scalable and provides an acceptable quality of matches compared to results obtained by the most referenced schema matching tools.
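As an illustration of the cosine measure between context vectors mentioned in the abstract, the following is a minimal sketch. The abstract does not say how context vectors are built, so the `context_vector` helper (token counts over the labels of neighbouring schema elements) is a hypothetical assumption and not necessarily the BMatch definition; only the cosine formula itself is standard.

```python
import math
from collections import Counter


def context_vector(neighbour_labels):
    """Build a token-count vector from the labels of neighbouring elements.

    Hypothetical context definition, assumed for illustration only.
    """
    tokens = []
    for label in neighbour_labels:
        tokens.extend(label.lower().split())
    return Counter(tokens)


def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    # Only tokens present in both vectors contribute to the dot product.
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty context shares nothing with any other context
    return dot / (norm_u * norm_v)


# Two elements whose neighbours share the tokens "name" and "address".
a = context_vector(["customer name", "customer address"])
b = context_vector(["client name", "client address"])
similarity = cosine(a, b)
```

Here the two contexts share two of their four tokens, so the measure is 2/6, or about 0.33. In an approach like the one described, such a score would be combined with terminological (string-based) measures rather than used alone.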
Similar resources
BMatch: a Semantically Context-based Tool Enhanced by an Indexing Structure to Accelerate Schema Matching
Schema matching is a crucial task for gathering information from the same domain. This is even more true on the web, where a large number of data sources are available and need to be matched. However, the schema matching process is still largely performed manually or semi-automatically, discouraging the deployment of large-scale integration systems. Indeed, these large-scale scenarios need a s...
Improving quality and performance of schema matching in large scale
Schema matching is a crucial task for gathering information from the same domain. However, this process is still largely performed manually or semi-automatically, discouraging the deployment of large-scale mediation systems. Indeed, these large-scale scenarios need a solution which ensures both an acceptable matching quality and good performance. In this article, we present the BMatch approach to effi...
Towards a Generic Approach for Schema Matcher Selection: Leveraging User Pre- and Post-match Effort for Improving Quality and Time Performance
Interoperability between applications or bridges between data sources are required to allow optimal information exchanges. Yet, some processes needed to bring about this integration cannot be fully automated due to their complexity. One of these processes is c...
An Indexing Structure for Automatic Schema Matching
Querying semantically related data sources depends on the ability to map between their schemas. Unfortunately, in most cases matching between schemas is still largely performed manually or semi-automatically. Consequently, the issue of finding semantic mappings has become the principal bottleneck in the deployment of large-scale mediation systems, where the number of ontologies and/or schemata...
An Improved Semantic Schema Matching Approach
Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Processing (OLAP), data mining, the semantic web [2] and schema integration. This task is defined as finding the semantic correspondences between elements of two schemas. Recently, schema matching has attracted considerable interest in both research and practice. In this paper, we present a new impr...
Journal title:
Volume, issue:
Pages: -
Publication date: 2008